Abstract — Linking network flows is an important problem in intrusion 
detection as well as anonymity. Passive traffic analysis can link flows 
but requires long periods of observation to reduce errors. Active traffic 
analysis, also known as flow watermarking, allows for better precision 
and is more scalable. Previous flow watermarks introduce significant 
delays to the traffic flow as a side effect of using a blind detection 
scheme; this enables attacks that detect and remove the watermark, 
while at the same time slowing down legitimate traffic. We propose the 
first non-blind approach for flow watermarking, called RAINBOW, that 
improves watermark invisibility by inserting delays hundreds of times 
smaller than previous blind watermarks, hence reduces the watermark 
interference on network flows. We derive and analyze the optimum 
detectors for RAINBOW as well as the passive traffic analysis under 
different traffic models by using hypothesis testing. Comparing the detection 
performance of RAINBOW and the passive approach, we observe 
that RAINBOW and passive traffic analysis perform similarly well 
in the case of uncorrelated traffic; however, the RAINBOW detector 
drastically outperforms the optimum passive detector in the case of 
correlated network flows. This justifies the use of non-blind watermarks 
over passive traffic analysis even though both approaches have similar 
scalability constraints. We confirm our analysis by simulating the detec- 
tors and testing them against large traces of real network flows. 

Index Terms — Traffic analysis, flow watermarking, non-blind water- 
marking, hypothesis testing. 



1 Introduction 

Internet attackers commonly relay their traffic through 
a number of (usually compromised) hosts in order to 
hide their identity. Detecting such hosts, called stepping 
stones, is therefore an important problem in computer 
security. The detection proceeds by finding correlated 
flows entering and leaving the network. Traditional approaches 
have used patterns inherent in traffic flows, 
such as packet timings, sizes, and counts, to link an 
incoming flow to an outgoing one [1], [2], [3], [4], [5]. 
More recently, an active approach called watermarking 
has been considered [6], [7]. In this approach, traffic 
characteristics of an incoming flow are actively per- 
turbed as they traverse some router to create a distinct 
pattern, which can later be recognized in outgoing flows. 
These techniques also have relevance to anonymous 
communication, as linking two flows can be used to 
break anonymity, and both passive traffic analysis [8], 
[9] and active watermarking [10], [11], [12] have been 
studied in that domain as well. 

The choice between passive and active techniques for 
traffic analysis exhibits a tradeoff. Passive approaches 
require observing relatively long-lived network flows, 
and storing or transmitting large amounts of traffic char- 
acteristics. Watermarking approaches are more efficient, 
with shorter observation periods necessary. They are 
also blind: rather than storing or communicating traffic 
patterns, all the necessary information is embedded in 
the flow itself. This, however, comes at a cost: to en- 
sure robustness, the watermarks introduce large delays 
(hundreds of milliseconds) to the flows, interfering with 
the activity of benign users, and making them subject to 
attacks [13], [14]. 

Motivated by this, we propose a new category for 
network flow watermarks, the non-blind flow watermarks. 
Non-blind watermarking lies in the middle of passive 
techniques and (blind) watermarking techniques: similar 
to passive techniques (and unlike blind watermarks), 
non-blind watermarks will record traffic pattern of in- 
coming flows and correlate them with outgoing flows. 
On the other side, similar to blind watermarks (and 
unlike passive techniques), non-blind watermarking aids 
traffic analysis by applying some modifications to the 
communication patterns of the intercepted flows. We 
develop and prototype the first non-blind flow watermark, 
called RAINBOW. RAINBOW records the timing 
pattern of incoming flows and correlates it with the 
timing pattern of the outgoing flows. On each incoming 
flow, RAINBOW also inserts a watermark by delaying 
some packets, after recording the received timings. As 
such a watermark is generated independently of the 
flows, this will diminish the effect of natural similarities 
between two unrelated flows, and allow a flow linking 
decision to be made over a much shorter time period. 
RAINBOW uses spread-spectrum techniques to make 
the delays much smaller than previous work. RAINBOW 
uses delays that are on the order of only a few millisec- 
onds; this means that RAINBOW watermarks not only 
do not interfere with traffic patterns of normal users, 
they are also virtually invisible, since the delays are of 
the same magnitude as natural network jitter. In [15] we 
use different information-theoretic tools to verify the 
invisibility of RAINBOW, and demonstrate its high performance 
in linking network flows through a prototype 
implementation over the PlanetLab [16] infrastructure. 

In this paper, we thoroughly analyze the detection 
performance of RAINBOW non-blind watermark, and 
compare it with that of passive traffic analysis schemes. 
By using hypothesis testing mechanisms from detection 
and estimation theory [17], we find the optimum 
detection schemes for RAINBOW as well as the optimum 
passive detectors under different models for network 
traffic. Modeling real-world network traffic is a compli- 
cated problem as it depends on many different param- 
eters; as a result, we only consider two extreme models 
of the network traffic: (1) independent flows where each 
flow is modeled as a Poisson process (traffic model A), 
and, (2) completely correlated flows where all flows are 
considered to have similar timing patterns (traffic model 
B). We assume that any real-world traffic model lies in 
the middle of these two extreme models. Our analysis 
leads to the following important conclusions: 

i) Non-blind watermarking always achieves better 
detection than passive traffic analysis. This is an essential 
result in motivating the use of non-blind watermarks 
over passive traffic analysis, since both have 
similar scalability constraints, i.e., both approaches have 
$O(n)$ communication overhead and $O(n^2)$ computation 
overhead [15]. Note that this point is not necessary 
(nor is it always true) to motivate the use of traditional 
(blind) watermarks over passive traffic analysis, since 
blind watermarks provide much better scalability (i.e., 
$O(1)$ communication overhead and $O(n)$ computation 
overhead [15]). 

ii) Our analysis shows that the performance advantage 
of non-blind watermarking (over passive schemes) is 
only marginal for uncorrelated network traffic, while it 
is very significant for correlated network traffic. This 
knowledge can be used to decide the best traffic anal- 
ysis approach in various applications. We validate our 
analysis through simulating the detection schemes on 
real network traces. In particular, we show that for 
highly correlated traffic, e.g., same webpage downloads, 
passive traffic analysis performs very poorly while a 
RAINBOW watermark is highly effective. 

iii) We also show (through both analysis and experiments) 
that the optimum watermark detector derived 
for correlated traffic (namely SLCorr) also performs 
very well for uncorrelated traffic (while the optimum 
watermark detector for uncorrelated traffic does not 
do well for correlated traffic). This allows one to use 
SLCorr as the sole watermark detector regardless of the 
type of traffic being observed. This is especially useful 
in real-world applications where the observed traffic is 
a mixture of different flow types. 

Note that in this paper we do not discuss the performance 
advantage of non-blind watermarks over traditional 
blind watermarks, as this has been justified in [15]. 

The rest of this paper is organized as follows: we 
review the problem of stepping stone detection and 
existing schemes in Section 2. Our RAINBOW scheme is 
presented in Section 3. In Section 4 we use hypothesis 
testing to find and analyze the optimum likelihood ratio 
detectors for passive and non-blind active (watermark) 
approaches under different traffic models, and analyze 
their false error rates. In Section 5 we validate the analysis 
results through simulation of the detection schemes 
over real network traces. Finally, the paper is concluded 
in Section 6. 

2 Background 

In this section, we review the problem of detecting step- 
ping stones and then review both the passive and active 
approaches to the problem. We compare the advantages 
and disadvantages of the two techniques, motivating our 
approach. 

2.1 Stepping Stone Detection 

A stepping stone is a host that is used to relay traffic 
through an enterprise network to another remote des- 
tination. Stepping stones are used to disguise the true 
origin of an attack. Detecting stepping stones can help 
trace attacks back to their true source. Also, stepping 
stones are often indicative of a compromised machine. 
Thus detecting stepping stones is a useful part of enter- 
prise security monitoring. 

Generally, stepping stones are detected by noticing 
that an outgoing flow from an enterprise matches an 
incoming flow. Since the relayed connections are often 
encrypted (using SSH [18], for example), only characteristics 
such as packet sizes, counts, and timings are available 
for such detection. And even these are not perfectly 
replicated from an incoming flow to an outgoing flow, as 
they are changed by padding schemes, retransmissions, 
and jitter. As a result, statistical methods are used to 
detect correlations among the incoming and outgoing 
flows. We next review the passive and active approaches. 

2.2 Passive Traffic Analysis 

In general, passive traffic analysis techniques operate by 
recording characteristics of incoming streams and then 
correlating them with outgoing ones. The right place to 
do this is often at the border router of an enterprise, so 
the overhead of this technique is the space used to store 
the stream characteristics long enough to check against 
correlated relayed streams, and the CPU time needed 
to perform the correlations. In a complex enterprise 
with many interconnected networks, a connection re- 
layed through a stepping stone may enter and leave the 
enterprise through different points; in such cases, there 
is additional communications overhead for transmitting 
traffic statistics between border routers. 

The passive schemes have explored using various 
characteristics for correlating streams. Zhang and Paxson 
[2] model interactive flows as on-off processes and 
detect linked flows by matching up their on-off behavior. 
Wang et al. [4] focus on inter-packet delays, and consider 
several different metrics for correlation. More recently, 
He and Tong used packet counts for stepping stone 
detection [19]. 

Donoho et al. were the first to consider intruder evasion 
techniques [3]. They defined a maximum-tolerable-delay 
(MTD) model of attacker evasion and suggested 
wavelet methods to detect stepping stones while being 
robust to adversarial action. Blum et al. used a Poisson 
model of flows to create a technique with provable 
upper bounds on false positive rates [5], given the MTD 
model. However, for realistic settings, their techniques 
require thousands of packets to be observed to achieve 
reasonable rates of false errors. 

2.3 Watermarks 

To address some of the efficiency concerns of passive 
traffic analysis, Wang et al. proposed the use of watermarks 
[6]. In this scenario, a border router will modify 
the traffic timings of the incoming flows to contain a 
particular pattern — the watermark. If the same pattern is 
present in an outgoing flow, a stepping stone is detected. 

Watermarks improve upon passive traffic analysis in 
two ways. First, by inserting a pattern that is uncorre- 
lated with any other flows, they can improve the de- 
tection efficiency, requiring smaller numbers of packets 
to be observed (hundreds instead of thousands) and 
providing lower false-positive rates ($10^{-4}$ or lower, as 
compared to $10^{-2}$ with passive traffic analysis). Second, 
they can operate in a blind fashion: after an incoming 
flow is watermarked, there is no need to record or com- 
municate the flow characteristics, since the presence of a 
watermark can be detected independently. The detection 
is also potentially faster, as there is no need to compare 
each outgoing flow to all the incoming flows within the 
same time frame. 

Watermarking techniques for network flows have been 
based on existing techniques for multimedia watermarking. 
For example, Wang et al. based their scheme on 
QIM watermarks [20]. Two other watermark schemes [7], 
[11] are based on patchwork watermarking [21], and 
Yu et al. [12] developed one based on spread-spectrum 
techniques [22]. Some of the schemes target anonymous 
communication rather than stepping stones as the appli- 
cation area (both involve the problem of linking flows), 
but the techniques for both are comparable. 

2.4 Watermark Properties 

To motivate our design, we first propose some desirable 
properties of network flow watermarks. First of all, 
a watermark should be robust to modifications of the 
traffic characteristics that will occur inside an enterprise 
network, such as jitter. Watermarks should also be re- 
silient to an adversary who actively tries to remove them 
from the flow, a property we call active robustness. The 
watermarks should also introduce little distortion, in that 
they should not significantly impact the performance 



of the flows. This is important because in a stepping- 
stone scenario, most watermarked flows will be benign. 
Finally, watermarks should be invisible even to attackers 
who specifically try to test for their presence. 

Looking at previous designs, all of them fail to be 
invisible: the watermarks introduce large delays, on the 
order of hundreds of milliseconds, on some packets, 
which can be easily detected by an attacker [13]. In 
fact, they cannot even be considered low-distortion, as 
such large delays are easily noticeable and bothersome 
to legitimate users. The watermarks are also not actively 
robust, as demonstrated by recent attacks [13], [14]. 

We also observe that active robustness and invisibil- 
ity are likely to be impossible to achieve at the same 
time. This is because to be invisible, the watermark can 
only introduce minute changes to the packet stream. In 
particular, it cannot introduce jitter of more than a few 
milliseconds, since otherwise it will be possible to tell 
it apart from the natural network jitter. However, an 
active attacker will be willing to introduce large delays to 
the network; for example, the maximum tolerable delay 
suggested in previous work is 500ms. As such, he will 
be able to destroy any low-order effects that will be 
introduced by the watermark. 

Further, it is easy to imagine an attacker determined 
to hide his tracks using even more drastic measures, 
such as using dummy packets to generate a completely 
independent Poisson process [5], which will render any 
linking techniques ineffective. As such, we decided to 
design a watermark scheme that is robust to normal 
network interference, though not actively robust, and is 
invisible. This will serve to detect stepping stones where 
attackers are unwilling (or unable) to actively distort 
their stream as it crosses a stepping stone. Further, as the 
watermark will be invisible, attackers will not be able to 
tell if they are being traced and thus will be less likely 
to try to apply costly watermark countermeasures. 

3 RAINBOW Watermark 

We next present the design of a new watermark scheme 
we call RAINBOW, for Robust And Invisible Non-Blind 
Watermark. Our scheme is robust (to passive inter- 
ference) and invisible. However, to achieve invisibility 
while maintaining detection efficiency, we make the 
scheme non-blind; that is, the timings of incoming flows are 
recorded and compared with the timings of outgoing 
flows. This allows us to make a robust watermark test 
with even low-amplitude watermarks. 

The RAINBOW watermark embedding process is 
shown in Figure 1. Suppose that a flow with the packet 
timing information $\{t_i^u \mid i = 1, \ldots, n+1\}$ enters a border 
router where it is to be watermarked (we use the superscript 
$u$ to denote an "unwatermarked" flow). Before 
embedding the watermark, the inter-packet delays 
(IPDs) of the flow, $\tau_i^u = t_{i+1}^u - t_i^u$, are recorded in an 
IPD database, which is accessible by the watermark 
detector. The watermark is subsequently embedded by 
delaying the packets by an amount such that the IPD 
of the $i$th watermarked packet is $\tau_i^w = \tau_i^u + w_i$. The 
watermark components $\{w_i\}_{i=1}^{n}$ take values $\pm a$ with 
equal probability. The value $a$ is chosen to be small 
enough so that the artificial jitter caused by watermark 
embedding is invisible to ordinary users and attackers.¹ 

Fig. 1. Model of RAINBOW network flow watermarking 
system. 

In order to apply watermark delays on the flow, 
output packet $i$ (arriving at time $t_i^u$) is delayed by 
$w_0 + \sum_{j=1}^{i-1} w_j$, where $w_0$ is the initial delay applied 
to the first packet. This results in $\tau_i^w = \tau_i^u + w_i$, as desired. 
Since we cannot delay a packet for a negative amount of time, $w_0$ must 
be chosen large enough to prevent this from happening. 
Since the sequence $w_i$ is generated from a random seed, 
the watermarker can calculate all of the partial sums 
$\sum_{j=1}^{i-1} w_j$ in advance and adjust $w_0$ accordingly. If a 
particular random seed requires a very large initial delay 
$w_0$, a different seed can be chosen. 
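
The delay schedule described above can be sketched in a few lines. This is a minimal illustration rather than the authors' implementation: the function name `embed_watermark`, the use of Python's `random.Random` as the seeded generator, and the choice of the smallest feasible $w_0$ are assumptions made for the example.

```python
import random


def embed_watermark(arrival_times, amplitude, seed):
    """Sketch of RAINBOW-style embedding (hypothetical helper).

    arrival_times: packet arrival times t_1..t_{n+1} of the unwatermarked flow.
    Returns (recorded_ipds, watermark, departure_times).
    """
    n = len(arrival_times) - 1
    # IPDs recorded in the database before any delay is applied.
    ipds = [arrival_times[i + 1] - arrival_times[i] for i in range(n)]

    # Watermark components take values +a or -a with equal probability.
    rng = random.Random(seed)
    watermark = [rng.choice((-amplitude, amplitude)) for _ in range(n)]

    # partial[i] = w_1 + ... + w_i, i.e. the extra delay of packet i+1
    # relative to the initial delay w0 (partial[0] = 0 for the first packet).
    partial = [0.0]
    for w in watermark:
        partial.append(partial[-1] + w)

    # Smallest initial delay that keeps every per-packet delay non-negative.
    w0 = -min(partial)

    departures = [t + w0 + p for t, p in zip(arrival_times, partial)]
    return ipds, watermark, departures
```

A caller that finds `w0` too large for a given seed could simply retry with another seed, as the text suggests. The sketch does not guard against packet reordering when an original IPD is smaller than the amplitude.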

As the flow traverses the network, it accumulates extra 
delays. Let $d_i$ be the delay that the $i$th packet accumulates 
by the time it reaches the watermark detector; i.e., the 
packet is received at the detector at time $t_i^r = t_i^w + d_i$. 
The IPD values at the detector are then: 

$$\tau_i^r = t_{i+1}^r - t_i^r = \tau_i^u + w_i + \delta_i \tag{1}$$

where $\delta_i = d_{i+1} - d_i$ is the jitter present in the network. 
As mentioned before, the RAINBOW scheme is non-blind 
and therefore the detector has access to the IPD 
database where the unwatermarked flows are recorded. 
Given an observed flow at the detector with IPDs $\tau^r$ 
and a previously recorded flow with IPDs $\tau^u$, the detector must 
decide whether the two flows are linked or not. In the 
next section we derive the optimum detectors for the 
RAINBOW watermarks according to the LRT rule. We 
also derive the optimum passive detectors, showing that 
the RAINBOW watermark performs significantly better 
than passive traffic analysis for correlated network flows. 

4 Detection approaches 

RAINBOW is the first non-blind flow watermarking 
scheme. Non-blind watermarking inherits similar scalability 
issues from passive traffic analysis. In this section, 
we show how non-blind watermarking improves 
traffic analysis performance as compared to 
traditional passive traffic analysis. 

1. Throughout this paper, by attacker we mean an attacker of the 
watermarking scheme. 

We derive optimum Likelihood Ratio Test (LRT) de- 
tectors for the RAINBOW watermarking scheme for 
different traffic models, and compare its detection per- 
formance with those of optimum passive detectors. We 
show that RAINBOW outperforms passive traffic analy- 
sis for different traffic models; this confirms what we 
expect intuitively from information theory, as a non- 
blind watermark detector has access to more information 
(the watermark and the IPDs), compared to a passive 
detector which only has access to the IPDs. We also 
show that the RAINBOW detector is reliable in different 
models, while the optimum passive detector fails in 
some scenarios. 

As the extreme models, we perform our detection 
analysis for two traffic models: 

• traffic model A: independent flows with i.i.d. inter- 
packet delays, and, 

• traffic model B: completely-correlated flows. 

As it is infeasible to evaluate the detection perfor- 
mance for all different traffic models, we discuss the 
detection performance for these two traffic models, and 
consider any real-world network flow to lie between 
these two extreme models. We show that an active 
detector, i.e., RAINBOW, is reliable for different models, 
while a passive detector fails for certain traffic models. 

4.1 Detection primitives 

We use hypothesis testing [17] to analyze the detection 
performance of active and passive detectors. For an 
active detector, we aim to distinguish between the 
following two hypotheses: 

• $H_0$ (null hypothesis): the received flow with IPDs $\tau^r$ 
is a new, unwatermarked flow, unlinked to the flow 
with IPDs $\tau$, and, 

• $H_1$: $\tau^r$ is the result of a flow with original IPDs $\tau$ 
being watermarked and passed through the network.² 

2. Note that there is another possibility, namely that $\tau^r$ is a watermarked 
flow, but not corresponding to $\tau$. However, we ignore this case 
because errors in this scenario do not matter: if the flow is said to be 
watermarked, then the detection algorithm is correct, and if it is said 
to be unwatermarked, it will later be tested against the correct $\tau$. 

Also, for a passive detector we consider the following 
hypothesis testing problem: 

• $H_0$ (null hypothesis): the received flow with IPDs $\tau^r$ 
is a new flow, unlinked to $\tau$ (the IPDs of another 
received flow), and, 

• $H_1$: $\tau^r$ is the result of $\tau$ passing through the network. 

We find the optimum likelihood-ratio tests (LRT) of 
these hypothesis testing problems. For any received flow 
with IPDs $\tau^r$, an LRT evaluates a test metric for the 
IPDs, $T[\tau^r]$, and compares it with a detection threshold 
$\eta$; if $T[\tau^r] > \eta$, the received flow is said to be linked to 
the one in the detector's database (with IPDs $\tau$). We 
can therefore express the false positive and false negative 
rates of the detector as: 

$$P_{FP} = P(T[\tau^r \mid H_0] > \eta) \tag{2}$$

$$P_{FN} = P(T[\tau^r \mid H_1] < \eta) \tag{3}$$
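
As a small illustration of how these two rates relate to a threshold, the sketch below estimates them empirically from samples of a test metric. The helper name `empirical_error_rates` and its inputs are hypothetical; the paper itself bounds these rates analytically rather than by counting.

```python
def empirical_error_rates(null_metrics, linked_metrics, eta):
    """Estimate (P_FP, P_FN) from samples of the test metric T.

    null_metrics:   values of T observed under H0 (unlinked flows).
    linked_metrics: values of T observed under H1 (linked flows).
    """
    # False positive: an unlinked flow whose metric exceeds the threshold.
    false_pos = sum(1 for t in null_metrics if t > eta) / len(null_metrics)
    # False negative: a linked flow whose metric falls below the threshold.
    false_neg = sum(1 for t in linked_metrics if t < eta) / len(linked_metrics)
    return false_pos, false_neg
```

Raising `eta` trades false positives for false negatives, which is the trade-off the later error-exponent plots describe as a function of the normalized threshold.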



4.2 Network jitter model 

We will model network delays as i.i.d. exponential, 
which implies that the jitter (difference of two delays) 
is i.i.d. according to a zero-mean Laplace distribution 
denoted by $\mathrm{Lap}(0, b_\delta)$, where $2b_\delta^2$ is the variance of the 
jitter. Of course, in a real network, delays will have some 
correlation; we compare the probability density function 
(PDF) of real observed jitter on a connection over PlanetLab 
[16] with a best-fit Laplace distribution in Figure 2. 
We can see that the real PDF has greater support at 0, and 
the Laplace distribution has a heavier tail. This means 
that our analysis of error rates will be conservative, since 
0 jitter will result in no error for our detection scheme. 
We have also conducted similar experiments with the 
same results on the Tor anonymous network [23] to consider 
the other application of watermarking. 
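
The link between exponential delays and Laplacian jitter can be checked numerically. The following is a sketch under stated assumptions: the scale `b = 0.01` seconds and the sample size are illustrative choices, not values taken from the paper, and `simulate_jitter` is a hypothetical helper. Note that consecutive differences share one delay term, so the simulated jitter values are Laplacian marginally but not strictly independent.

```python
import random


def simulate_jitter(b, n, seed=0):
    """Draw n+1 i.i.d. exponential delays with mean b and return their
    consecutive differences, i.e. n jitter samples."""
    rng = random.Random(seed)
    # expovariate takes the rate, which is 1 / mean.
    delays = [rng.expovariate(1.0 / b) for _ in range(n + 1)]
    return [delays[i + 1] - delays[i] for i in range(n)]


b = 0.01  # assumed jitter scale in seconds (illustrative only)
jitter = simulate_jitter(b, 100_000)

mean = sum(jitter) / len(jitter)
variance = sum((x - mean) ** 2 for x in jitter) / len(jitter)

# For Lap(0, b) the mean is 0 and the variance is 2 * b**2.
print("sample mean:", mean)
print("sample variance:", variance, "theoretical:", 2 * b * b)
```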

4.3 Traffic model A: independent flows, i.i.d. IPDs 

In this model, we assume that the candidate flows are 
independent. Also, each flow has i.i.d. IPDs, i.e., the flow 
is modeled with a Poisson process. This represents a 
good model for non-interactive network flows. 

4.3.1 Passive detection (PASSV scheme) 
In this section, we find the optimum likelihood ratio 
(LRT) passive detector for traffic model A. Suppose 
that the flow with IPDs $\tau$ is known to the detector. 
The detector needs to check whether it is correlated with some 
received flow $\tau^r$; here $\tau^*$ denotes the IPDs of an unrelated 
flow, with $\tau$ and $\tau^*$ independent. So, 
in this case the hypothesis testing problem is: 

$$\begin{cases} H_0: \tau_i^r = \tau_i^* + \delta_i \\ H_1: \tau_i^r = \tau_i + \delta_i \end{cases} \tag{4}$$

where the $\delta_i$ represent the network jitter. Based on 
our measurements over PlanetLab we model the network 
jitter with an i.i.d. Laplacian distribution $\mathrm{Lap}(0, b)$ 
(see Section 4.2). 


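In order to find the optimum LRT detector, we first 
need to find the PDF of $\tau_i^r$ under the different hypotheses, 
i.e., $p_i(\cdot)$ for hypothesis $H_i$. As model 
A suggests, we model the IPDs $\tau_i^*$ as i.i.d. exponential 
with rate $\lambda$. So, under hypothesis $H_0$ the received signal 
$\tau_i^r$ is the sum of a Laplacian and an exponential 
random variable; we use Lemma 3 in Appendix A to 
find $p_0(\cdot)$: 

$$p_0(\tau_i^r) = \begin{cases} \dfrac{\lambda\, e^{\tau_i^r / b}}{2(\lambda b + 1)}, & \tau_i^r < 0 \\[2ex] \dfrac{\lambda\, e^{-\lambda \tau_i^r}}{2(\lambda b + 1)} + \dfrac{\lambda \left(e^{-\lambda \tau_i^r} - e^{-\tau_i^r / b}\right)}{2(1 - \lambda b)}, & \tau_i^r \ge 0 \end{cases} \tag{5}$$

In the case of $H_1$, since $\tau_i$ is known to the detector, 
we can model $\tau_i^r$ as a Laplacian distribution with mean 
$\tau_i$. So: 

$$p_1(\tau_i^r) = \frac{1}{2b}\, e^{-|\tau_i^r - \tau_i| / b} \tag{6}$$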



Note that even though the real-world IPDs can never 
be negative, the densities po and pi return a non-zero 
density for negative values of the IPDs. In fact, this is due 
to the approximation we make in modeling the network 
jitter as a two-sided Laplacian distribution, and its effect 
is very small for ordinary network flows based on our 
simulations [15]. 

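Having the densities $p_0$ and $p_1$, we derive the optimum 
detector based on the likelihood ratio test to be: 

$$L(\tau^r) \underset{H_0}{\overset{H_1}{\gtrless}} e^{\eta} \tag{7}$$

where $\eta$ is the LRT detection threshold and 

$$L(\tau^r) = \prod_{i=1}^{n} L_i(\tau_i^r) \tag{8}$$

$$L_i(\tau_i^r) = \frac{p_1(\tau_i^r)}{p_0(\tau_i^r)} \tag{9}$$

We define $\eta_n = \eta / n$ as the normalized detection threshold. 
A value of $\eta_n = 0$ results in a MiniMax detector. 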
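
The PASSV decision can be sketched in log form, where the product of likelihood ratios becomes a sum compared against $n\eta_n$. This is an illustrative sketch, not the authors' code. The closed form used for the $H_0$ density is our own convolution of an exponential with rate `lam` and a Laplacian with scale `b`, valid for `lam * b < 1`; the function names are hypothetical, and any parameter values supplied by a caller (for example `b = 0.01` s) are assumptions.

```python
import math
import random


def log_p0(y, lam, b):
    """Log-density at y of Exp(lam) + Lap(0, b); assumes lam * b < 1."""
    c = lam * b
    if y < 0:
        return math.log(lam) + y / b - math.log(2.0 * (c + 1.0))
    # For y >= 0 the density is lam*exp(-lam*y) times a strictly positive
    # factor; keeping it in this form avoids underflow for large y.
    inner = 1.0 / (1.0 - c * c) - math.exp(-(1.0 / b - lam) * y) / (2.0 * (1.0 - c))
    return math.log(lam) - lam * y + math.log(inner)


def log_p1(y, tau, b):
    """Log-density at y of a Laplacian with mean tau and scale b."""
    return -abs(y - tau) / b - math.log(2.0 * b)


def passv_log_likelihood(received, recorded, lam, b):
    """Sum of per-packet log-likelihood ratios ln(p1 / p0)."""
    return sum(log_p1(y, t, b) - log_p0(y, lam, b)
               for y, t in zip(received, recorded))


def passv_linked(received, recorded, lam, b, eta_n=0.0):
    """Declare the flows linked when ln L exceeds n * eta_n."""
    return passv_log_likelihood(received, recorded, lam, b) > len(recorded) * eta_n


def sample_laplace(rng, b):
    """Laplacian sample as the difference of two exponentials with mean b."""
    return rng.expovariate(1.0 / b) - rng.expovariate(1.0 / b)
```

Working with the log-likelihood avoids the numerical underflow that multiplying hundreds of small ratios would cause.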

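4.3.1.1 Detection performance: Let us consider the 
case where the detector uses the PASSV detection scheme 
in order to link a received flow with IPDs $\tau^r$ to a known 
flow with IPDs $\tau$, i.e., a registered flow. Considering the 
assumptions made in traffic model A, i.e., the IPDs 
being i.i.d., we use Lemma 1 (part b) in Appendix A to 
find the false positive ($P_{FP}$) and false negative ($P_{FN}$) 
error rates of the PASSV detector: 

$$P_{FP}^{\tau} \le \prod_{i=1}^{n} e^{-(s\eta_n - \mu_{0,i}(s))} \tag{10}$$

$$P_{FN}^{\tau} \le \prod_{i=1}^{n} e^{-((s-1)\eta_n - \mu_{0,i}(s))} \tag{11}$$

where $0 < s < 1$ and: 

$$\mu_{0,i}(s) = \ln \int p_0^{1-s}(\tau_i^r)\, p_1^{s}(\tau_i^r)\, d\tau_i^r \tag{12}$$

The error probabilities $P_{FN}^{\tau}$ and $P_{FP}^{\tau}$ correspond to 
a fixed, known IPD sequence $\tau$. The overall false errors 
are evaluated by averaging $P_{FP}^{\tau}$ and $P_{FN}^{\tau}$ with respect 
to $\tau$: 

$$P_{FP} = E_{\tau}\{P_{FP}^{\tau}\} \tag{13}$$

$$\le \prod_{i=1}^{n} E_{\tau_i}\left\{e^{-(s\eta_n - \mu_{0,i}(s))}\right\} \tag{14}$$

$$= \left(\int e^{-(s\eta_n - \mu_{0,i}(s))}\, \lambda e^{-\lambda \tau_i}\, d\tau_i\right)^{n} \tag{15}$$

$$P_{FN} = E_{\tau}\{P_{FN}^{\tau}\} \tag{16}$$

$$\le \prod_{i=1}^{n} E_{\tau_i}\left\{e^{-((s-1)\eta_n - \mu_{0,i}(s))}\right\} \tag{17}$$

$$= \left(\int e^{-((s-1)\eta_n - \mu_{0,i}(s))}\, \lambda e^{-\lambda \tau_i}\, d\tau_i\right)^{n} \tag{18}$$

We can represent the upper bounds of these false 
errors as: 

$$P_{FP} \le e^{-n E_{FP}(s, \eta_n)} \tag{19}$$

$$P_{FN} \le e^{-n E_{FN}(s, \eta_n)} \tag{20}$$

where 

$$E_{FP}(s, \eta_n) = -\ln\left(\int e^{-(s\eta_n - \mu_{0,i}(s))}\, \lambda e^{-\lambda \tau_i}\, d\tau_i\right) \tag{21}$$

$$E_{FN}(s, \eta_n) = -\ln\left(\int e^{-((s-1)\eta_n - \mu_{0,i}(s))}\, \lambda e^{-\lambda \tau_i}\, d\tau_i\right) \tag{22}$$

$(0 < s < 1)$ 

For each detection threshold $\eta_n$, we find the tightest 
exponent bounds $E^*_{FP}(\eta_n)$ and $E^*_{FN}(\eta_n)$ such that: 

$$E^*_{FP}(\eta_n) = \max_{0 < s < 1} E_{FP}(s, \eta_n) \tag{23}$$

$$E^*_{FN}(\eta_n) = \max_{0 < s < 1} E_{FN}(s, \eta_n) \tag{24}$$

4.3.1.2 Analysis results: We use Mathematica 7.0 to 
evaluate the false error exponents of (23) and (24). The 
parameters used for the simulations are b = lO^^sec and 
$\lambda = 5$ pps, borrowed from [15]. Figure 3 plots the tightest 
bounds for the error exponents $E^*_{FP}(\eta_n)$ and $E^*_{FN}(\eta_n)$ 
for different thresholds $\eta_n$. Note that the optimum $s$ 
varies with the decision threshold. For $\eta_n = 0$ the false 
positive and false negative errors are equal; we name this 
error rate the Cross-Over Error Rate (COER). For the 
mentioned setting of the variables the COER exponent 
of the PASSV detector is equal to 1.06396. 

4.3.2 Active detection (ACTV scheme) 
In this section, we find the optimum LRT detector for the 
RAINBOW non-blind watermark for traffic model A. 
We have the following hypothesis testing problem: 

$$\begin{cases} H_0: \tau_i^r = \tau_i^* + \delta_i \\ H_1: \tau_i^r = \tau_i + w_i + \delta_i \end{cases} \tag{25}$$

where the $\tau_i$ are the IPDs registered in the IPD database, 
and the $\tau_i^*$ are the IPDs of an independent flow. As before, 
in order to find the optimum LRT detector we need to 
find the distribution of $\tau_i^r$ under the different hypotheses. Using 
Lemma 3 in Appendix B we find the corresponding PDF 
under $H_0$ as: 

$$p_0(\tau_i^r) = \begin{cases} \dfrac{\lambda\, e^{\tau_i^r / b}}{2(\lambda b + 1)}, & \tau_i^r < 0 \\[2ex] \dfrac{\lambda\, e^{-\lambda \tau_i^r}}{2(\lambda b + 1)} + \dfrac{\lambda \left(e^{-\lambda \tau_i^r} - e^{-\tau_i^r / b}\right)}{2(1 - \lambda b)}, & \tau_i^r \ge 0 \end{cases} \tag{26}$$

Since $\tau_i$ and $w_i$ are known to the detector, we find the 
PDF under hypothesis $H_1$ as the following: 

$$p_1(\tau_i^r) = \frac{1}{2b}\, e^{-|\tau_i^r - \tau_i - w_i| / b} \tag{27}$$

So, the optimum detector based on the likelihood ratio 
test is: 

$$L(\tau^r) \underset{H_0}{\overset{H_1}{\gtrless}} e^{\eta} \tag{28}$$

where $\eta$ is the LRT detection threshold and 

$$L(\tau^r) = \prod_{i=1}^{n} L_i(\tau_i^r) \tag{29}$$

$$L_i(\tau_i^r) = \frac{p_1(\tau_i^r)}{p_0(\tau_i^r)} \tag{30}$$

4.3.2.1 Detection performance: As before, considering 
the independence of the IPDs and also the watermark 
bits we use Lemma 1 (part b) in Appendix A to 
find the error probabilities of the ACTV detector for a 
given $\tau$ and $w$: 

$$P_{FP}^{\tau,w} \le \prod_{i=1}^{n} e^{-(s\eta_n - \mu_{0,i}^{w_i}(s))} \tag{31}$$

$$P_{FN}^{\tau,w} \le \prod_{i=1}^{n} e^{-((s-1)\eta_n - \mu_{0,i}^{w_i}(s))} \tag{32}$$

where $0 < s < 1$, and: 

$$\mu_{0,i}^{w_i}(s) = \ln \int p_0^{1-s}(\tau_i^r)\, p_1^{s}(\tau_i^r)\, d\tau_i^r \tag{33}$$

As $P_{FN}^{\tau,w}$ and $P_{FP}^{\tau,w}$ correspond to a fixed IPD sequence 
$\tau$ and watermark $w$, we evaluate the overall 
false errors by averaging $P_{FP}^{\tau,w}$ and $P_{FN}^{\tau,w}$ with respect to 
$\tau$ and $w$: 

$$P_{FP} = E_{w} E_{\tau}\{P_{FP}^{\tau,w}\} \tag{34}$$

$$\le \prod_{i=1}^{n} E_{w_i} E_{\tau_i}\left\{e^{-(s\eta_n - \mu_{0,i}^{w_i}(s))}\right\} \tag{35}$$

$$= \left(\frac{1}{2} \sum_{w_i = \pm a} \int e^{-(s\eta_n - \mu_{0,i}^{w_i}(s))}\, \lambda e^{-\lambda \tau_i}\, d\tau_i\right)^{n} \tag{36}$$

$$P_{FN} = E_{w} E_{\tau}\{P_{FN}^{\tau,w}\} \tag{37}$$

$$\le \prod_{i=1}^{n} E_{w_i} E_{\tau_i}\left\{e^{-((s-1)\eta_n - \mu_{0,i}^{w_i}(s))}\right\} \tag{38}$$

$$= \left(\frac{1}{2} \sum_{w_i = \pm a} \int e^{-((s-1)\eta_n - \mu_{0,i}^{w_i}(s))}\, \lambda e^{-\lambda \tau_i}\, d\tau_i\right)^{n} \tag{39}$$

The approximate upper bounds can be formulated as: 

$$P_{FP} \le e^{-n E_{FP}(s, \eta_n)} \tag{40}$$

$$P_{FN} \le e^{-n E_{FN}(s, \eta_n)} \tag{41}$$

where 

$$E_{FP}(s, \eta_n) = -\ln\left(\frac{1}{2} \sum_{w_i = \pm a} \int e^{-(s\eta_n - \mu_{0,i}^{w_i}(s))}\, \lambda e^{-\lambda \tau_i}\, d\tau_i\right) \tag{42}$$

$$E_{FN}(s, \eta_n) = -\ln\left(\frac{1}{2} \sum_{w_i = \pm a} \int e^{-((s-1)\eta_n - \mu_{0,i}^{w_i}(s))}\, \lambda e^{-\lambda \tau_i}\, d\tau_i\right) \tag{43}$$

$(0 < s < 1)$ 

Finally, the tightest bounds for each $\eta_n$ are found 
by maximizing the error exponents with respect to the 
parameter $s$: 

$$E^*_{FP}(\eta_n) = \max_{0 < s < 1} E_{FP}(s, \eta_n) \tag{44}$$

$$E^*_{FN}(\eta_n) = \max_{0 < s < 1} E_{FN}(s, \eta_n) \tag{45}$$

4.3.2.2 Analysis results: Using Mathematica 7.0 we 
evaluate the false error exponents of (44) and (45). As 
before, we use the parameters b = 10~^sec, a = lO^^sec, 
and $\lambda = 5$ pps for the simulations. Figure 4 plots the 
tightest bounds for the error exponents $E^*_{FP}(\eta_n)$ and 
$E^*_{FN}(\eta_n)$ for different thresholds $\eta_n$. The COER exponent 
occurs for $\eta_n = 0$ and is equal to 1.06828, which is 
slightly better than that of the PASSV detector 
evaluated before, i.e., 1.06396. 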

4.4 Traffic model B: correlated flows, correlated 
IPDs 

As the other extreme of traffic models we investigate 
the traffic model with correlated IPDs. We consider the 
case where all of the network flows have the same IPDs, 
i.e., for any two flows with IPDs $\tau^*$ and $\tau$ we have that 
$\tau_i^* = \tau_i = c_i$ for all $i$. In particular, this model captures 
the behavior of a number of widely used traffic types, 
including file transfers, browsing the same websites, etc. 

4.4.1 Passive detection 

In this model, a passive detector faces the following 
hypothesis testing problem: 

$$\begin{cases} H_0: \tau_i^r = \tau_i^* + \delta_i \\ H_1: \tau_i^r = \tau_i + \delta_i \end{cases} \tag{46}$$

where $\tau_i^* = \tau_i = c_i$. The optimum LRT detector for this 
problem is random guessing: 

$$L(\tau^r) = \mathrm{RND} \tag{47}$$

where RND is a uniform random variable. The detection 
rule is: 

$$L(\tau^r) \underset{H_0}{\overset{H_1}{\gtrless}} e^{\eta} \tag{48}$$

4.4.1.1 Detection performance: Since the detector 
is based on random guessing, the false errors are as 
follows: 

$$P_{FP} = p \tag{49}$$

$$P_{FN} = 1 - p \tag{50}$$

where $0 < p < 1$ is determined by the choice of $\eta$. 
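
One way to see why random guessing is the best a passive detector can do under this model: with $\tau_i^* = \tau_i = c_i$, both hypotheses induce the same Laplacian density for each received IPD, so every per-packet likelihood ratio equals one. Writing this out under the $\mathrm{Lap}(0, b)$ jitter model used earlier:

```latex
L_i(\tau_i^r)
  = \frac{p_1(\tau_i^r)}{p_0(\tau_i^r)}
  = \frac{\frac{1}{2b}\, e^{-|\tau_i^r - c_i| / b}}
         {\frac{1}{2b}\, e^{-|\tau_i^r - c_i| / b}}
  = 1
\quad\Longrightarrow\quad
L(\tau^r) = \prod_{i=1}^{n} L_i(\tau_i^r) = 1 .
```

The observations therefore carry no information about which hypothesis holds, and any decision rule is equivalent to flipping a (possibly biased) coin.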

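4.4.2 Active detection (SLCorr scheme) 

In this case, we have the following hypothesis testing 
problem: 

$$\begin{cases} H_0: \tau_i^r = \tau_i^* + \delta_i \\ H_1: \tau_i^r = \tau_i + w_i + \delta_i \end{cases} \tag{51}$$

Since $\tau_i^* = \tau_i = c_i$, this can be reduced to the 
following hypothesis testing problem: 

$$\begin{cases} H_0: y_i = \delta_i \\ H_1: y_i = w_i + \delta_i \end{cases} \tag{52}$$

where $y_i = \tau_i^r - \tau_i$. The optimum LRT detector for this 
problem can be found considering the distribution of $y_i$ 
under the different hypotheses: 

$$p_0(y_i) = \frac{1}{2b}\, e^{-|y_i| / b} \tag{53}$$

$$p_1(y_i) = \frac{1}{2b}\, e^{-|y_i - w_i| / b} \tag{54}$$

So, we can derive the LRT detection metric as: 

$$L_i(y_i) = \frac{p_1(y_i)}{p_0(y_i)} \tag{55}$$

which can be expressed as: 

$$\ln L_i(y_i) = \frac{1}{b}\left(|y_i| - |y_i - w_i|\right) \tag{56}$$

$$= \frac{2}{b}\, f^{SL}\!\left(y_i - \frac{w_i}{2}\right) \cdot \mathrm{sgn}(w_i) \tag{57}$$

$f^{SL}(\cdot)$ is a soft limiter with breakpoints at $-a/2$ and $+a/2$ 
($a$ is the watermark amplitude as defined before): 

$$f^{SL}(x) = \begin{cases} \dfrac{a}{2}, & x > \dfrac{a}{2} \\[1.5ex] x, & -\dfrac{a}{2} \le x \le \dfrac{a}{2} \\[1.5ex] -\dfrac{a}{2}, & x < -\dfrac{a}{2} \end{cases} \tag{58}$$

We can reformulate the optimum detection rule as: 

$$D(y) \underset{H_0}{\overset{H_1}{\gtrless}} \eta \tag{59}$$

where 

$$D(y) = \sum_{i=1}^{n} D_i(y_i) \tag{60}$$

and 

$$D_i(y_i) = \frac{b}{2} \ln L_i(y_i) = f^{SL}\!\left(y_i - \frac{w_i}{2}\right) \cdot \mathrm{sgn}(w_i) \tag{61}$$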
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Fig. 3. Analytical error exponents E*pp{r]n) and E*p^{-qn) 
of the PASSV detection scheme for different values of ??„ 
(traffic model A). (6 = lO^^sec, A = bpps) 



Fig. 4. Analytical error exponents Epp{r]n) and E*p^{rin) of 
the ACTV detection scheme for different values of 77„ (traffic 
model A). (6 = lO^^sec, A = hpps) 
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Fig. 5. Block diagram of the SLCorr detection scheme. 



We call this detector SLCorr, as it is composed of a soft limiter followed by a correlation block. From a communications point of view, the soft limiter is useful in reducing the detection noise in channels with Laplacian-distributed noise. We use SLCorr as the detection scheme for the RAINBOW watermark, as discussed later. Figure 5 shows the block diagram of the SLCorr detector. SLCorr is a minimax detector for a detection threshold of $\eta = 0$.
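The soft limiter and correlation steps described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' C++ implementation: the function names, the choice $w_i \in \{+a, -a\}$, the rate of 5 pps used to draw the shared IPDs, and the sampling of the jitter as a difference of two exponentials are all assumptions made for the example.

```python
import random

def soft_limiter(x, a):
    """Soft limiter f^a with breakpoints at -a/2 and +a/2 (eq. 58)."""
    half = a / 2.0
    return max(-half, min(half, x))

def slcorr_metric(received_ipds, original_ipds, watermark, a):
    """Normalized SLCorr metric D(y)/n built from eqs. (60) and (61)."""
    total = 0.0
    for t_r, t, w in zip(received_ipds, original_ipds, watermark):
        y = t_r - t                       # y_i = tau^r_i - tau_i
        sign = 1.0 if w > 0 else -1.0     # sgn(w_i), assuming w_i is +a or -a
        total += soft_limiter(y - w / 2.0, a) * sign
    return total / len(watermark)

def laplace(rng, b):
    """Sample Lap(0, b) as the difference of two exponentials with mean b."""
    return rng.expovariate(1.0 / b) - rng.expovariate(1.0 / b)

def demo(seed=7, n=200, a=0.01, b=0.01):
    """Traffic model B: both hypotheses share the same underlying IPDs."""
    rng = random.Random(seed)
    tau = [rng.expovariate(5.0) for _ in range(n)]      # shared IPDs c_i (hypothetical rate)
    w = [rng.choice((a, -a)) for _ in range(n)]         # hypothetical watermark sequence
    h0 = [t + laplace(rng, b) for t in tau]             # unwatermarked flow plus jitter
    h1 = [t + wi + laplace(rng, b) for t, wi in zip(tau, w)]  # watermarked flow plus jitter
    return slcorr_metric(h0, tau, w, a), slcorr_metric(h1, tau, w, a)
```

Calling `demo()` returns the normalized metric for an unwatermarked and a watermarked flow; with these parameters the first value should fall below the threshold $\eta = 0$ and the second above it.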

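4.4.2.1 Detection performance: The SLCorr test metric is given in (59) to (61). Let us define $f^0_i(\cdot)$ and $f^1_i(\cdot)$ as the PDFs of $x_i = y_i - \frac{w_i}{2}$ under hypotheses $H_0$ and $H_1$, respectively. We have that:

\[ f^0_i(x_i) = \frac{1}{2b} e^{-\frac{|x_i + w_i/2|}{b}} \tag{62} \]

\[ f^1_i(x_i) = \frac{1}{2b} e^{-\frac{|x_i - w_i/2|}{b}} \tag{63} \]

Based on these, we can evaluate $p_0(\cdot)$ and $p_1(\cdot)$, namely the PDFs of $D_i(y_i)$ under hypotheses $H_0$ and $H_1$, respectively:

\[ p_0(D_i) = \begin{cases} \frac{1}{2} e^{-\frac{a}{b}} & D_i = \frac{a}{2} \\ \frac{1}{2b} e^{-\frac{D_i + a/2}{b}} & -\frac{a}{2} < D_i < \frac{a}{2} \\ \frac{1}{2} & D_i = -\frac{a}{2} \end{cases} \tag{64} \]

\[ p_1(D_i) = \begin{cases} \frac{1}{2} & D_i = \frac{a}{2} \\ \frac{1}{2b} e^{\frac{D_i - a/2}{b}} & -\frac{a}{2} < D_i < \frac{a}{2} \\ \frac{1}{2} e^{-\frac{a}{b}} & D_i = -\frac{a}{2} \end{cases} \tag{65} \]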

Considering that the $D_i$ are i.i.d. across $i$ under each hypothesis, we use the Chernoff bound (part (c) of Lemma 1 in Appendix A) to find the error probabilities of the SLCorr detector:

\[ P_{FP} \le e^{-n(s\eta_n - \mu_0(s))} \quad (\forall s > 0), \qquad \mu_0(s) = \mu_{D_i|H_0}(s) \tag{66} \]

\[ P_{FN} \le e^{-n(s\eta_n - \mu_1(s))} \quad (\forall s < 0), \qquad \mu_1(s) = \mu_{D_i|H_1}(s) \tag{67} \]

where $\eta_n = \eta/n$ is the normalized detection threshold. We have that:



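\[ \mu_0(s) = \mu_{D_i|H_0}(s) = \ln \int_{-\infty}^{\infty} e^{sx} p_0(x)\,dx = \ln\left[ \frac{sb}{2(sb-1)} e^{\frac{sa}{2} - \frac{a}{b}} + \frac{sb-2}{2(sb-1)} e^{-\frac{sa}{2}} \right] \tag{68} \]

and,

\[ \mu_1(s) = \mu_{D_i|H_1}(s) = \ln \int_{-\infty}^{\infty} e^{sx} p_1(x)\,dx = \ln\left[ \frac{sb}{2(sb+1)} e^{-\frac{sa}{2} - \frac{a}{b}} + \frac{sb+2}{2(sb+1)} e^{\frac{sa}{2}} \right] \tag{69} \]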



We can express the above bounds on the $P_{FP}$ and $P_{FN}$ error rates as:

\[ P_{FP} \le e^{-n \cdot E_{FP}(s, \eta_n)} \tag{70} \]

\[ P_{FN} \le e^{-n \cdot E_{FN}(s, \eta_n)} \tag{71} \]

where

\[ E_{FP}(s, \eta_n) = s\eta_n - \mu_0(s) \quad (s > 0) \tag{72} \]

\[ E_{FN}(s, \eta_n) = s\eta_n - \mu_1(s) \quad (s < 0) \tag{73} \]

Fig. 6. Analytical error exponents $E^*_{FP}(\eta_n)$ and $E^*_{FN}(\eta_n)$ of SLCorr for different values of $\eta_n$ (traffic model B). ($b = 10^{-2}$ sec, $a = 10^{-2}$ sec)



Finally, the tightest bounds for each $\eta_n$ are found by maximizing the error exponents with respect to the parameter $s$:

\[ E^*_{FP}(\eta_n) = \max_{s>0} E_{FP}(s, \eta_n) \tag{74} \]

\[ E^*_{FN}(\eta_n) = \max_{s<0} E_{FN}(s, \eta_n) \tag{75} \]


4.4.2.2 Analysis results: We use Mathematica 7.0 to evaluate the error exponents of (74) and (75). The parameters used for the simulations are $b = 10^{-2}$ sec and $a = 10^{-2}$ sec. Figure 6 plots the tightest bounds for the error exponents $E^*_{FP}(\eta_n)$ and $E^*_{FN}(\eta_n)$ for different thresholds $\eta_n$. The COER exponent occurs for $\eta_n = 0$ and is equal to 0.0945.
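As a rough cross-check of this value, the maximization in (74) and (75) can be approximated by a grid search over $s$ using the closed forms of $\mu_0(s)$ and $\mu_1(s)$ from (68) and (69). The sketch below is only an assumption-laden stand-in for the Mathematica evaluation: the function names, the grid resolution, and the search range $0 < sb < 4$ are choices made for this example.

```python
import math

def mu0(s, a, b):
    """CGF of D_i under H0, eq. (68); requires s*b != 1."""
    u = s * b
    m = (u / (2 * (u - 1))) * math.exp(s * a / 2 - a / b) \
        + ((u - 2) / (2 * (u - 1))) * math.exp(-s * a / 2)
    return math.log(m)

def mu1(s, a, b):
    """CGF of D_i under H1, eq. (69); requires s*b != -1."""
    u = s * b
    m = (u / (2 * (u + 1))) * math.exp(-s * a / 2 - a / b) \
        + ((u + 2) / (2 * (u + 1))) * math.exp(s * a / 2)
    return math.log(m)

def best_exponents(eta_n, a, b, steps=4000, u_max=4.0):
    """Grid approximation of E*_FP (s > 0) and E*_FN (s < 0), eqs. (74)-(75)."""
    e_fp = e_fn = 0.0   # both exponents tend to 0 as s -> 0
    for k in range(steps):
        # Midpoints of the grid never hit the removable singularity |s*b| = 1.
        s = (k + 0.5) * (u_max / steps) / b
        e_fp = max(e_fp, s * eta_n - mu0(s, a, b))
        e_fn = max(e_fn, -s * eta_n - mu1(-s, a, b))
    return e_fp, e_fn

e_fp, e_fn = best_exponents(eta_n=0.0, a=0.01, b=0.01)
```

With $a = b = 10^{-2}$ sec and $\eta_n = 0$ the two exponents coincide, which is the crossover point reported above.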



4.5 Discussion 

Above, we derived the optimum passive and active detectors for the traffic analysis problem and evaluated their performance by finding the Chernoff upper bounds of their error rates. In this section, we use the asymptotic relative efficiency (ARE) as a tool to compare their detection performance.

The asymptotic relative efficiency (ARE) is a measure for comparing two discrete-time detection schemes. For two discrete detection schemes $S_1$ and $S_2$, the ARE metric is defined as $ARE_{S_1,S_2} = \lim_{n \to \infty} n_2/n$, where $n$ is the number of $S_1$'s samples. The parameter $n_2$ is the smallest number of $S_2$ samples for which $S_2$'s error rate is smaller than or equal to the error rate of $S_1$ (with $n$ samples). An ARE metric of $ARE_{S_1,S_2} > 1$ indicates that $S_1$ is asymptotically more efficient than $S_2$. Chernoff [24] finds the ARE metric of two detectors $S_1$ and $S_2$ using their Chernoff error upper bounds as:

\[ ARE_{S_1,S_2} = E_1/E_2 \tag{76} \]

where $E_1$ and $E_2$ are the error exponents of the Chernoff upper bounds for the $S_1$ and $S_2$ detectors, respectively.



Using the analysis results from Sections 4.3 and 4.4, we can derive the ARE metric of the optimum passive and active detectors for the two traffic models as:

\[ ARE_{PASSV,ACTV}|_A = 1.06396/1.06828 \approx 0.996 \tag{77} \]

\[ ARE_{RND,SLCorr}|_B = 0/0.0945 = 0 \tag{78} \]

This shows that the optimum active detector outperforms the optimum passive detector in both traffic models A and B (which is intuitively expected from information theory). As an important observation, the active detector's advantage is very small for traffic model A; however, the active detector significantly outperforms the optimum passive detector in traffic model B, i.e., for correlated traffic. In other words, the active detector provides very good detection performance across traffic models, whereas passive detection performs very poorly for more correlated network traffic.
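The arithmetic behind (77) and (78) is simple enough to reproduce directly. The snippet below is a minimal sketch; the helper name `are` is hypothetical and the exponents are the values reported in the text.

```python
def are(e1, e2):
    """ARE of detector S1 relative to S2 from their Chernoff exponents, eq. (76)."""
    return e1 / e2

# Traffic model A: optimum passive (PASSV) vs. optimum active (ACTV) detector.
are_model_a = are(1.06396, 1.06828)

# Traffic model B: random guessing (exponent 0) vs. SLCorr.
are_model_b = are(0.0, 0.0945)
```

An ARE just below 1 in model A means the passive detector needs only slightly more samples than the active one, while an ARE of 0 in model B means no number of samples lets the passive detector match SLCorr.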

In the rest of this section, we analyze the performance of the SLCorr scheme under traffic model A, showing that even though SLCorr is not the optimum detector for traffic model A, it still provides very good detection performance under this model. Based on this, we choose SLCorr as the sole detector for RAINBOW, regardless of the behavior of the network flows. This simplifies watermark detection: real-world traffic is a combination of models A and B, and detection can be performed regardless of the type of the received traffic. We also analyze the performance of the PASSV and ACTV detectors under traffic model B, showing their inefficiency in this model.

4.5.1 SLCorr detection performance for traffic model A

The SLCorr scheme is the optimum active detector for traffic model B, but not for traffic model A. In this section we show that SLCorr achieves good detection performance even under traffic model A, allowing a system designer to use it as the sole detection scheme regardless of the type of traffic. SLCorr faces the following hypothesis testing problem under traffic model A:

\[ \begin{cases} H_0: \tau^r_i = \tau^*_i + \delta_i \\ H_1: \tau^r_i = \tau_i + w_i + \delta_i \end{cases} \tag{79} \]

Considering SLCorr's detection metric, given in (59) to (61), one can rewrite the hypothesis testing problem as:

\[ \begin{cases} H_0: y_i = \tau^*_i + \delta_i - \tau_i \\ H_1: y_i = w_i + \delta_i \end{cases} \tag{80} \]

where $y_i = \tau^r_i - \tau_i$. Let $f^0_i(\cdot)$ and $f^1_i(\cdot)$ denote the PDFs of $y_i|H_0$ and $y_i|H_1$, respectively. We have that:



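\[ y_i|H_1 \sim Lap(w_i, b) \tag{81} \]

\[ f^1_i(y_i) = \frac{1}{2b} e^{-\frac{|y_i - w_i|}{b}} \tag{82} \]

Also, using Lemma 2 in Appendix B, where

\[ \delta_i \sim Lap(0, b) \tag{83} \]

\[ (\tau^*_i - \tau_i) \sim Lap(0, 1/\lambda) \tag{84} \]

we have:

\[ f^0_i(y_i) = \frac{b\lambda}{2(1 - b^2\lambda^2)} \left( \frac{1}{b} e^{-\lambda|y_i|} - \lambda e^{-\frac{|y_i|}{b}} \right) \tag{85} \]

Now, let us define $p_0(\cdot)$ and $p_1(\cdot)$ as the PDFs of $D_i(y_i)$ under hypotheses $H_0$ and $H_1$, respectively. We derive $p_0(\cdot)$ as:

\[ p_0(D_i) = \begin{cases} \frac{e^{-\lambda a} - b^2\lambda^2 e^{-a/b}}{2(1 - b^2\lambda^2)} & D_i = \frac{a}{2} \\ \frac{b\lambda}{2(1 - b^2\lambda^2)} \left( \frac{1}{b} e^{-\lambda(D_i + a/2)} - \lambda e^{-\frac{D_i + a/2}{b}} \right) & -\frac{a}{2} < D_i < \frac{a}{2} \\ \frac{1}{2} & D_i = -\frac{a}{2} \end{cases} \tag{86} \]

Also, using (82) we derive $p_1(\cdot)$ as:

\[ p_1(D_i) = \begin{cases} \frac{1}{2} & D_i = \frac{a}{2} \\ \frac{1}{2b} e^{\frac{D_i - a/2}{b}} & -\frac{a}{2} < D_i < \frac{a}{2} \\ \frac{1}{2} e^{-\frac{a}{b}} & D_i = -\frac{a}{2} \end{cases} \tag{87} \]

Based on the $p_0(\cdot)$ and $p_1(\cdot)$ distributions and using the Chernoff bounds for signal detection (part (c) of Lemma 1 in Appendix A), we find the error probabilities of the detector to be:

\[ P_{FP} \le e^{-n(s\eta_n - \mu_0(s))} \quad (\forall s > 0), \qquad \mu_0(s) = \mu_{D_i|H_0}(s) \tag{88} \]

\[ P_{FN} \le e^{-n(s\eta_n - \mu_1(s))} \quad (\forall s < 0), \qquad \mu_1(s) = \mu_{D_i|H_1}(s) \tag{89} \]

where we have:

\[ \mu_0(s) = \ln \int_{-\infty}^{\infty} e^{sx} p_0(x)\,dx \tag{90} \]

\[ \phantom{\mu_0(s)} = \ln\left\{ \frac{1}{2(1 - b^2\lambda^2)} \left[ \frac{s}{s - \lambda} e^{\frac{sa}{2} - \lambda a} - \frac{b^2\lambda^2\, sb}{sb - 1} e^{\frac{sa}{2} - \frac{a}{b}} + \frac{(sb-1)(s - 2\lambda) - b^2\lambda^2 (s - \lambda)(sb - 2)}{(s - \lambda)(sb - 1)} e^{-\frac{sa}{2}} \right] \right\} \tag{91} \]

and,

\[ \mu_1(s) = \mu_{D_i|H_1}(s) = \ln \int_{-\infty}^{\infty} e^{sx} p_1(x)\,dx \tag{92} \]

\[ \phantom{\mu_1(s)} = \ln\left[ \frac{sb}{2(sb+1)} e^{-\frac{sa}{2} - \frac{a}{b}} + \frac{sb+2}{2(sb+1)} e^{\frac{sa}{2}} \right] \tag{93} \]

As before, we can express the above bounds on the $P_{FP}$ and $P_{FN}$ error rates as:

\[ P_{FP} \le e^{-n \cdot E_{FP}(s, \eta_n)} \tag{94} \]

\[ P_{FN} \le e^{-n \cdot E_{FN}(s, \eta_n)} \tag{95} \]

where

\[ E_{FP}(s, \eta_n) = s\eta_n - \mu_0(s) \quad (s > 0) \tag{96} \]

\[ E_{FN}(s, \eta_n) = s\eta_n - \mu_1(s) \quad (s < 0) \tag{97} \]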



Finally, the tightest bounds for each $\eta_n$ are found by maximizing the error exponents with respect to the parameter $s$:

\[ E^*_{FP}(\eta_n) = \max_{s>0} E_{FP}(s, \eta_n) \tag{98} \]

\[ E^*_{FN}(\eta_n) = \max_{s<0} E_{FN}(s, \eta_n) \tag{99} \]



4.5.1.1 Analysis results: We use Mathematica 7.0 to evaluate the error exponents of (98) and (99). The parameters used for the simulations are $b = 10^{-2}$ sec, $\lambda = 5$ pps, and $a = 10^{-2}$ sec. Figure 7 plots the tightest bounds for the error exponents $E^*_{FP}(\eta_n)$ and $E^*_{FN}(\eta_n)$ for different thresholds $\eta_n$. The COER exponent occurs for $\eta_n = 9.6 \times 10^{-4}$ and is equal to 0.0228. Also, Figure 8 shows the COER exponent with respect to different values of the watermark amplitude, $a$. As we can see, increasing the watermark amplitude improves the detection performance (but reduces the watermark invisibility, as discussed in [15]).
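The behavior of SLCorr under traffic model A can also be illustrated with a small Monte Carlo experiment based on the reduced hypotheses in (80). The sketch below is not the evaluation used in the paper: it assumes i.i.d. exponential IPDs with rate $\lambda$, Laplacian jitter, a watermark with $w_i \in \{+a, -a\}$, and a normalized threshold of 0.001; all function names and default parameters are chosen for this example.

```python
import random

def soft_limiter(x, a):
    """Soft limiter f^a with breakpoints at -a/2 and +a/2."""
    return max(-a / 2.0, min(a / 2.0, x))

def laplace(rng, b):
    """Sample Lap(0, b) as the difference of two exponentials with mean b."""
    return rng.expovariate(1.0 / b) - rng.expovariate(1.0 / b)

def slcorr(y, w, a):
    """Normalized SLCorr metric D(y)/n for differences y_i and watermark w_i."""
    return sum(soft_limiter(yi - wi / 2.0, a) * (1.0 if wi > 0 else -1.0)
               for yi, wi in zip(y, w)) / len(w)

def error_rates(n=200, trials=500, a=0.01, b=0.01, lam=5.0, eta_n=0.001, seed=1):
    """Empirical false positive and false negative rates under traffic model A."""
    rng = random.Random(seed)
    fp = fn = 0
    for _ in range(trials):
        w = [rng.choice((a, -a)) for _ in range(n)]
        tau = [rng.expovariate(lam) for _ in range(n)]       # IPDs of the watermarked flow
        tau_star = [rng.expovariate(lam) for _ in range(n)]  # IPDs of an unrelated flow
        # H0 in eq. (80): y_i = tau*_i + delta_i - tau_i
        y0 = [ts + laplace(rng, b) - t for ts, t in zip(tau_star, tau)]
        # H1 in eq. (80): y_i = w_i + delta_i
        y1 = [wi + laplace(rng, b) for wi in w]
        fp += slcorr(y0, w, a) > eta_n
        fn += slcorr(y1, w, a) <= eta_n
    return fp / trials, fn / trials
```

Running `error_rates()` with larger `n` or larger `a` should drive both empirical rates toward zero, in line with the trend described for the COER exponent.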

4.5.2 Detection performance of PASSV and ACTV schemes for traffic model B

As derived before, the PASSV and ACTV schemes are the optimum passive and active detectors for traffic model A. We show that PASSV and ACTV perform very poorly under traffic model B, i.e., for correlated traffic. This is unlike the SLCorr detector, which works well for both traffic models.

Under traffic model B, the PASSV detector faces the hypothesis testing problem of (46) with $\tau^*_i = \tau_i = c_i$. One can see that in this case the PASSV detection rule described in Section 4.3.1 is exactly the same under both the $H_0$ and $H_1$ hypotheses. This means that the false positive error rate of the PASSV scheme for correlated flows is equal to its true positive rate, which makes the PASSV scheme equivalent to a random guessing detector. Similarly, for traffic model B the ACTV scheme deals with the hypothesis testing problem of (51) with $\tau^*_i = \tau_i = c_i$. Our analysis and simulations in Mathematica confirm that the ACTV detection metric results in very close values for the two hypotheses $H_0$ and $H_1$, rendering the ACTV detection scheme ineffective for network flows in traffic model B (we skip the details due to space constraints).

5 Simulation Results 

In this section, we evaluate the performance of the three detection schemes introduced before, i.e., SLCorr, ACTV, and PASSV, by simulating them over real-world traffic. We show that SLCorr outperforms the other detectors when dealing with real-world network flows, due to the intrinsic correlations among real-world network flows. We use the CAIDA network traces gathered in January 2009 [25] for our simulations, and we have implemented the detection schemes in C++. From the CAIDA traces we extract three types of network flows: TCP ports 443 (HTTPS), 25 (SMTP), and 22 (SSH). We only select flows with rates lower than 30 pps (this is because the parameters of the optimum detectors depend on the rate of the flows). In all of the simulations, the detectors use the detection thresholds derived through the analysis in the previous sections, i.e., 0.001 for SLCorr, 0 for ACTV, and 0 for PASSV.

Fig. 8. The COER error exponent of SLCorr in traffic model A for different watermark amplitudes.

In the first set of simulations, we evaluate the false positive error rate of the three detection schemes for the network flows mentioned above. For each detection scheme, we run the detection algorithm for 10000 different pairs of network flows. In order to show the effect of the number of packets on the detection performance, we run the experiments for four different values of the N parameter, i.e., 25, 50, 100, and 200. Tables 1, 2, and 3 show the false positive rates of the experiments along with some statistics on the detection metrics for the three TCP ports 443, 25, and 22, respectively. The results show that in most cases the SLCorr scheme results in smaller false positive error rates than the ACTV and PASSV schemes. This is because real network flows deviate from the Poisson model of the traffic, due to the intrinsic dependencies among the packets of real network flows. The SLCorr detector, on the other hand, is the optimum detector for correlated network flows, and it also achieves reasonable detection performance for Poisson-modeled network flows. Comparing the results for the three different traffic types (Tables 1, 2, and 3), we observe that the ACTV and PASSV schemes perform the worst for the SSH traffic (TCP port 22); we explain this by the fact that SSH flows are more correlated than HTTPS and SMTP flows, as they are driven by the typing behavior of human users. Another general observation from the simulations is that the detection performance improves as the number of packets, N, increases.

In the second set of experiments, we run the simulated detection schemes to measure the false negative error rates. Again, we use the detection thresholds derived through the analysis in the previous sections. In each simulation of the SLCorr and ACTV schemes, the candidate network flow is watermarked using the RAINBOW scheme (Section 3), and then a network delay is randomly selected and applied to that flow from a large pool of network delays measured over the PlanetLab infrastructure [15] (the average standard deviation of the network delay is around 10 ms). Likewise, for the PASSV simulations the candidate network flow is delayed in the same way to simulate the network interference. The delayed flow is then correlated with the original flow (non-delayed and non-watermarked) using each of the detection schemes. Tables 4, 5, and 6 show the false negative rates of the experiments for the three detection schemes, evaluated for the three TCP ports. For the watermark detection schemes SLCorr and ACTV, the experiments are repeated for four different values of the watermark amplitude, i.e., a = 10 ms, 15 ms, 20 ms, and 30 ms. Also, all of the simulations are run for different values of the watermark length, N. The results show that, with reasonable parameters for the RAINBOW watermark, the SLCorr and ACTV detection schemes achieve very small false negative rates, comparable to those of passive detection. Again, we see that increasing N improves the detection performance.

In the third set of experiments, we evaluate the false positive error rate of the three detection schemes over highly correlated network flows. More specifically, we use flow traces corresponding to web browsing activities of human users that target the same destination websites at different times and from different network locations.² Table 7 shows the false positive error rates

2. The traces are generated and provided to us by Xun Gong from UIUC.

TABLE 1
False positive rate of different detection schemes for port 443 network flows. Each experiment is run for 10000 different pairs of flows.

N     Scheme    Detection metric                      False
                Min          Avg          Max         positive
25    SLCorr    -0.005       -0.0031      0.00012     0.0068
      ACTV      -457.385     -37.8203     14.1698     0.0151
      PASSV     -245.249     -35.6167     2.8426      0.0054
50    SLCorr    -0.005       -0.0039      0.0012      0.0002
      ACTV      -503.655     -36.8637     3.9970      0.0159
      PASSV     -567.917     -45.5303     2.8297      0.0009
100   SLCorr    -0.005       -0.0042      -0.0004     0
      ACTV      -515.555     -33.2478     -2.2095     0
      PASSV     -555.857     -44.0783     2.9567      0.0023
200   SLCorr    -0.005       -0.0042      -2.5E-5     0
      ACTV      -608.838     -33.5721     0.9735      0.0005
      PASSV     -559.164     -43.2514     2.9535      0.0018

TABLE 2
False positive rate of different detection schemes for port 25 network flows. Each experiment is run for 10000 different pairs of flows.

N     Scheme    Detection metric                      False
                Min          Avg          Max         positive
25    SLCorr    -0.005       -0.0039      0.0018      0.0008
      ACTV      -461.182     -50.3404     6.1398      0.0003
      PASSV     -364.275     -49.6125     1.8952      0.003
50    SLCorr    -0.005       -0.0042      0.0004      0
      ACTV      -359.413     -35.2567     -0.3314     0
      PASSV     -364.652     -53.7937     1.5171      0.0015
100   SLCorr    -0.005       -0.0037      -0.0007     0
      ACTV      -352.581     -31.3738     0.0420      0.0001
      PASSV     -368.304     -55.4709     1.4271      0.0013
200   SLCorr    -0.005       -0.0041      -0.0014     0
      ACTV      -190.366     -29.6399     -1.2917     0
      PASSV     -375.012     -56.3069     1.3936      0.0012

TABLE 3
False positive rate of different detection schemes for port 22 network flows. Each experiment is run for 10000 different pairs of flows.

N     Scheme    Detection metric                      False
                Min          Avg          Max         positive
25    SLCorr    -0.005       -0.0029      0.0026      0.0024
      ACTV      -495.125     -18.3825     6.8506      0.0269
      PASSV     -88.1381     -8.7786      3.3239      0.1031
50    SLCorr    -0.005       -0.0038      0.0011      0.0001
      ACTV      -628.45      -20.1249     4.5654      0.0144
      PASSV     -80.5081     -9.3516      3.3204      0.0879
100   SLCorr    -0.005       -0.0037      0.0005      0
      ACTV      -522.241     -23.434      2.8119      0.0142
      PASSV     -101.337     -9.8241      3.3202      0.0861
200   SLCorr    -0.005       -0.0039      1.67E-5     0
      ACTV      -487.594     -26.357      4.7264      0.0212
      PASSV     -104.547     -9.7138      3.3195      0.0896

TABLE 4
False negative rate of different detection schemes for port 443 network flows. Each experiment is run for 10000 different pairs of flows.


N 


Scheme 


False Negative | 


10 ms 


15 ms 


20 ms 


30 ms 


25 


SLCorr 
ACTV 
PASSV 


0.039 
lE-04 


0.005 
lE-04 


0.0004 



0.0003 
0.0004 


0.0002 1 


50 


SLCorr 
ACTV 
PASSV 


0.0137 



0.0004 











1 


100 


SLCorr 
ACTV 
PASSV 


0.0028 















1 


200 


SLCorr 
ACTV 
PASSV 


0.000977 















1 



TABLE 5
False negative rate of different detection schemes for port 25 network flows. Each experiment is run for 10000 different pairs of flows.

TABLE 6
False negative rate of different detection schemes for port 22 network flows. Each experiment is run for 10000 different pairs of flows.



N 


Scheme 


False Negative | 


10 ms 


15 ms 


20 ms 


30 ms 


25 


SLCorr 
ACTV 
PASSV 


0.0346 
0.0003 


0.0035 
0.0002 


0.0007 
0.0004 



0.0002 


0.0001 1 


50 


SLCorr 
ACTV 
PASSV 


0.0154 



0.0005 



0.0003 




0.0006 


1 


100 


SLCorr 
ACTV 
PASSV 


0.002636 















1 


200 


SLCorr 
ACTV 
PASSV 


















1 



N 


Scheme 


False Negative 1 


10 ms 


15 ms 


20 ms 


30 ms 


25 


SLCorr 
ACTV 
PASSV 


0.028879 



0.001775 




0.00062 



0.005727 


0.0002 1 


50 


SLCorr 
ACTV 
PASSV 


0.009671 















1 


100 


SLCorr 
ACTV 
PASSV 


















1 


200 


SLCorr 
ACTV 
PASSV 


















for different detection schemes for different websites and for different values of N (each simulation is averaged over 100 runs). As can be seen, in most of the cases, the ACTV and PASSV detection schemes result in very high false positive rates, while the SLCorr scheme results in no false positive errors in any of the cases. This confirms what we expect intuitively: the PASSV and ACTV schemes are the optimum passive and active detection schemes for independent network traffic models, but they perform poorly as the network flows get more correlated. The SLCorr scheme, however, is the optimum detection scheme for correlated network flows, and it also performs well enough in the case of independent network flows.

6 Conclusions 

In this paper, we introduce the first non-blind active traffic analysis scheme, RAINBOW. Using tools from detection and estimation theory, we find the optimum passive and (non-blind) active traffic analysis schemes for different types of network flows. We show that, for different traffic models, the optimum active detectors outperform the optimum passive detectors. This advantage is more significant for more correlated network traffic, e.g., web browsing traffic. Considering the fact that both passive and non-blind active approaches to traffic analysis are constrained by similar scalability issues, this finding motivates the use of non-blind active approaches over passive approaches.

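Appendix A
Chernoff bounds

Lemma 1 (Chernoff bound for signal detection): Consider the following binary hypothesis testing problem for signal detection:

\[ \begin{cases} H_0: y_i \sim p_{0,i}(y_i) & i = 1, \ldots, n \\ H_1: y_i \sim p_{1,i}(y_i) & i = 1, \ldots, n \end{cases} \tag{100} \]

For this hypothesis testing problem, consider a detection scheme with rule:

\[ T(\mathbf{y}) \underset{H_0}{\overset{H_1}{\gtrless}} \eta \]

such that $T(\mathbf{y}) = \sum_{i=1}^{n} T_i(y_i)$.

We are interested in finding the false positive rate $P_{FP} = \Pr\{T(\mathbf{y}) > \eta\}$ under $H_0$ and the false negative rate $P_{FN} = \Pr\{T(\mathbf{y}) < \eta\}$ under $H_1$ of this detector in different cases. We have that:

a) General case:

\[ P_{FP} \le e^{-(s\eta - \mu^0_T(s))} \quad (s > 0) \tag{101} \]

\[ P_{FN} \le e^{-(s\eta - \mu^1_T(s))} \quad (s < 0) \tag{102} \]

where $\mu^k_T(s)$ is the cumulant generating function (CGF) of $T(\cdot)$ under hypothesis $H_k$.

b) Independent $T_i(\cdot)$'s: We have that:

\[ \mu^k_T(s) = \sum_{i=1}^{n} \mu^k_{T_i}(s) \]

where $k$ corresponds to hypothesis $H_k$. This results in the error rates:

\[ P_{FP} \le \prod_{i=1}^{n} e^{-(s\eta/n - \mu^0_{T_i}(s))} \quad (\forall s > 0) \tag{103} \]

\[ P_{FN} \le \prod_{i=1}^{n} e^{-(s\eta/n - \mu^1_{T_i}(s))} \quad (\forall s < 0) \tag{104} \]

For $T_i(y_i) = \ln\left[ \frac{p_{1,i}(y_i)}{p_{0,i}(y_i)} \right]$ this reduces to

\[ P_{FP} \le \prod_{i=1}^{n} e^{-(s\eta/n - \mu_{0,i}(s))} \quad (0 < s < 1) \tag{105} \]

\[ P_{FN} \le \prod_{i=1}^{n} e^{-((s-1)\eta/n - \mu_{0,i}(s))} \quad (0 < s < 1) \tag{106} \]

where:

\[ \mu_{0,i}(s) = \ln \int p_{0,i}^{1-s}(y)\, p_{1,i}^{s}(y)\,dy \tag{107} \]

c) i.i.d. $T_i(\cdot)$'s: For any $i$ and $j$ we have that $\mu^k_{T_i}(s) = \mu^k_{T_j}(s) = \mu_k(s)$, which reduces the error rates to:

\[ P_{FP} \le e^{-n(s\eta/n - \mu_0(s))} \quad (\forall s > 0) \tag{108} \]

\[ P_{FN} \le e^{-n(s\eta/n - \mu_1(s))} \quad (\forall s < 0) \tag{109} \]

For $T_i(y_i) = \ln\left[ \frac{p_1(y_i)}{p_0(y_i)} \right]$, this reduces to

\[ P_{FP} \le e^{-n(s\eta/n - \mu_0(s))} \quad (0 < s < 1) \tag{110} \]

\[ P_{FN} \le e^{-n((s-1)\eta/n - \mu_0(s))} \quad (0 < s < 1) \tag{111} \]

where:

\[ \mu_0(s) = \ln \int p_0^{1-s}(y)\, p_1^{s}(y)\,dy \tag{112} \]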

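Appendix B
Summation of random variables

Lemma 2 (Summation of two Laplacian random variables): Suppose that we have two independent random variables distributed according to the Laplacian distribution as $X \sim Lap(0, 1/\alpha)$ and $Y \sim Lap(0, 1/\beta)$, where $\alpha \ne \beta$. The PDF of the sum of these random variables, $Z = X + Y$, is given by:

\[ f_Z(z) = \frac{\alpha\beta}{2(\alpha^2 - \beta^2)} \left( \alpha e^{-\beta|z|} - \beta e^{-\alpha|z|} \right) \tag{113} \]

If $\alpha = \beta$ then:

\[ f_Z(z) = \frac{\alpha}{4} \left( 1 + \alpha|z| \right) e^{-\alpha|z|} \tag{114} \]

Proof: Using the convolution of PDFs:

\[ f_Z(z) = (f_X * f_Y)(z) \]

□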
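A quick numerical sanity check of the density of the sum of two Laplacian variables can be written as follows. This is a sketch under stated assumptions: it uses the standard closed form for the convolution of two zero-mean Laplacian densities with rates $\alpha$ and $\beta$, a plain midpoint rule for integration, and hypothetical function names; it checks only that the density integrates to one and that its second moment equals $2/\alpha^2 + 2/\beta^2$.

```python
import math

def laplace_sum_pdf(z, alpha, beta):
    """PDF of Z = X + Y for X ~ Lap(0, 1/alpha), Y ~ Lap(0, 1/beta)."""
    if alpha == beta:
        return alpha / 4.0 * (1.0 + alpha * abs(z)) * math.exp(-alpha * abs(z))
    c = alpha * beta / (2.0 * (alpha ** 2 - beta ** 2))
    return c * (alpha * math.exp(-beta * abs(z)) - beta * math.exp(-alpha * abs(z)))

def moments(alpha, beta, half_width=8.0, steps=200000):
    """Midpoint-rule estimates of the total mass and second moment of f_Z."""
    h = 2.0 * half_width / steps
    mass = second = 0.0
    for k in range(steps):
        z = -half_width + (k + 0.5) * h
        p = laplace_sum_pdf(z, alpha, beta)
        mass += p * h
        second += z * z * p * h
    return mass, second

# alpha = 1/b with b = 0.01 sec and beta = lambda = 5 pps, as in Section 4.5.1.
mass, second = moments(100.0, 5.0)
```

With $\alpha = 1/b$ and $\beta = \lambda$ this is the density used in (85) for $y_i$ under $H_0$ in traffic model A.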

TABLE 7
False positive error rate of different detection schemes for network flows generated by browsing the same websites.

Website          N=25                      N=50                      N=100
                 SLCorr  ACTV    PASSV     SLCorr  ACTV    PASSV     SLCorr  ACTV    PASSV
baidu.com        0       0.08    0.29      0       0.12    0.07      0       0.12    0.08
blogger.com      0       0.56    0.97      0       0.89    0.63      0       0.34    1
facebook.com     0       0.95    0.91      0       0.9     0.97      0       0.59    0.96
live.com         0       0.81    1         0       0.33    1         0       0.08    0.38
wikipedia.org    0       0.44    0.94      0       0.44    0.44      0       0.39    0.46
yahoo.co.jp      0       0.08    0.66      0       0.03    0.33      0               0.05
yahoo.com        0       1       1         0       0.02    1         0               0.23
yandex.com       0       0.11    0.89      0       0.02    0.08      0               0.02

Lemma 3 (Summation of Laplacian and Exponential r.v.s): Suppose that $X \sim Exp(\lambda)$ and $Y \sim Lap(0, b)$. The

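random variable $Z = X + Y$ has the following distribution:

\[ f_Z(z) = \begin{cases} \frac{\lambda}{2(\lambda b - 1)} e^{-\frac{z}{b}} + \frac{\lambda}{1 - \lambda^2 b^2} e^{-\lambda z} & z \ge 0 \\ \frac{\lambda}{2(\lambda b + 1)} e^{\frac{z}{b}} & z < 0 \end{cases} \]

Also, for a fixed integer $m$, the random variable $T = Z - m$ has the PDF:

\[ f_T(t) = f_Z(t + m) \]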

We abbreviate this as: 

/^^(i,TO,A,6)=/TW 

Proof: Using the convolution of PDFs:

\[ f_Z(z) = (f_X * f_Y)(z) \]

□